Graph-based Document Expansion and Robust SCR Models for False Positives: Experiments at the NTCIR-12 SpokenQuery&Doc-2
Authors
Abstract
In this paper, we report our experiments at the NTCIR-12 SpokenQuery&Doc-2 task. We participated in the spoken query driven spoken content retrieval (SQ-SCR) subtasks of SpokenQuery&Doc-2. We submitted two types of results: a conventional spoken content retrieval method (referred to as C-SCR) and a spoken term detection (STD) based approach to SCR (referred to as STD-SCR). The latter was proposed in order to deal with speech recognition errors and out-of-vocabulary (OOV) words. We extended each SCR method in several ways. For C-SCR, we applied a graph-based document expansion method. For STD-SCR, we applied retrieval models that are robust to false positive errors by using word co-occurrence information.
Similar resources
STD Score Combination with Acoustic Likelihood and Robust SCR Models for False Positives: Experiments at NTCIR-11 SpokenQuery&Doc
In this paper, we report our experiments at the NTCIR-11 SpokenQuery&Doc task [1]. We participated in both the STD and SCR subtasks of SpokenDoc. For the STD subtask, we try to improve detection accuracy by combining the DTW distance between syllable sequences and the acoustic likelihood of the detected speech segment. The final combined score, which is obtained by applying logistic regression on the, was...
Spoken Document Retrieval Experiments for SpokenQuery&Doc at Ryukoku University (RYSDT)
In this paper, we describe the spoken document retrieval (SDR) systems of Ryukoku University, which participated in the NTCIR-11 “SpokenQuery&Doc” task. The NTCIR-11 SpokenQuery&Doc task has two subtasks: the “spoken content retrieval (SCR) subtask” and the “spoken term detection (STD) subtask”. We participated in the SCR and STD subtasks as team RYSDT. In this paper, our SDR and STD systems are described.
Overview of the NTCIR-12 SpokenQuery&Doc-2 Task
This paper presents an overview of the Spoken Query and Spoken Document retrieval (SpokenQuery&Doc-2) task at the NTCIR-12 Workshop. This task included spoken query driven spoken content retrieval (SQ-SCR) and a spoken query driven spoken term detection (SQ-STD) as the two subtasks. The paper describes details of each sub-task, the data used, the creation of the speech recognition systems used ...
DCU at the NTCIR-11 SpokenQuery&Doc Task
We describe DCU’s participation in the NTCIR-11 SpokenQuery&Document task. We participated in the spoken-query spoken content retrieval (SQ-SCR) subtask by using the slide group segments as basic indexing and retrieval units. Our approach integrates normalised prosodic features into a standard BM25 weighting function to increase weights for terms that are prominent in speech. Text queries and re...
DCU at the NTCIR-12 SpokenQuery&Doc-2 Task
We describe DCU’s participation in the NTCIR-12 SpokenQuery&Doc (SQD-2) task. In the context of the slide-group retrieval sub-task, we experiment with a passage retrieval method that re-scores each passage according to the relevance score of the document from which the passage is taken. This is performed by linearly interpolating their relevance scores which are calculated using the Okapi BM25 ...